Background


Many  tumour  cells  have  lost  the  ability  to  undergo  apoptosis  in  response  to 
DNA  damage  caused  either  by  irradiation  or  chemical  mutagens.  This  appears 
to  be  one  of  the  key  reasons  that  so  many  tumours  are  resistant  to  treatment  by 
either  radio-  or  chemotherapy.  Understanding  how  apoptosis  is  regulated  in 
normal  cells  may  shed  light  on  the  mutations  that  render  tumour  cells  resistant 
to  apoptotic  stimuli  and  may  thus  suggest  novel  therapeutic  strategies.  The  key 
focus  of  this  project  is  to  understand  how  a  particular  conserved  family  of 
regulators  of  apoptosis,  the  Inhibitor  of  Apoptosis  Proteins  (IAPs),  suppresses 
cell  death. 

IAPs  suppress  apoptosis  both  in  the  fruit  fly  Drosophila  melanogaster  and  in 
vertebrates.  All  IAPs  contain  at  least  a  single  copy  of  a  highly  conserved  domain, 
the  Baculovirus  IAP  Repeat  (BIR)  domain;  this  is  essential  for  their  anti-apoptotic 
activity.  Understanding  how  a  BIR  domain  regulates  apoptosis  is  thus  an 
important  step  in  furthering  our  understanding  of  the  molecular  mechanisms  of 
apoptosis. 

The  nematode  worm  C.  elegans  has  been  a  key  tool  for  research  into  apoptosis 
since  the  inception  of  the  field.  Genetic  analysis  of  programmed  cell  death  in  C. 
elegans  led  to  the  identification  of  the  basal  machinery  of  cell  death  which  is 
conserved  between  worms  and  humans.  Analysis  of  the  function  of  IAPs  in  C. 
elegans  might  shed  light  on  the  function  of  IAPs  in  human  cells.  I  have 
previously  identified  two  BIR-containing  Proteins  (BIRPs)  in  the  nematode  worm 
C.  elegans.  One  of  these,  BIR-1,  appears  to  play  no  role  in  the  regulation  of 
programmed  cell  death  in  C.  elegans;  however,  BIR-1  is  required  for  the 
completion  of  cytokinesis.  Furthermore,  I  demonstrated  that  a  human 
homologue  of  BIR-1,  the  BIRP  survivin,  can  partially  substitute  for  BIR-1  in  the 
nematode;  this  shows  that  BIRPs  have  a  conserved  role  in  the  regulation  of 
cytokinesis.  These  results  have  subsequently  been  extended  by  other  groups 
who  have  shown  that  both  fission  and  budding  yeasts  contain  BIRPs  and  that  in
each  case  inhibition  of  these  genes  leads  to  profound  cell  cycle  defects  and
polyploidy. 

The  BIR  domain  thus  seems  to  have  a  role  in  cytokinesis  in  eukaryotic  cells 
ranging  from  single-celled  yeasts  to  human  cells.  My  focus  has  been  to  use  C.
elegans  to  understand  the  function  of  BIRPs  in  cytokinesis:  how  they  are  regulated,
what  precise  functional  role  they  have,  and  which  proteins  they  interact  with.

Experimental  Approach 

Cytokinesis  is  a  complex  process  that  is  likely  to  be  highly  regulated  and  to 
involve  a  large  number  of  proteins.  My  approach  to  understanding  how  BIR-1 
functions  in  cytokinesis  has  been  to  try  to  identify  other  genes  that  are  also 
required  for  cytokinesis  in  C.  elegans.  In  this  way  I  hope  to  be  able  to  build  up  a 
more  comprehensive  view  of  the  cytokinesis  machinery  and  subsequently  to 
attempt  to  understand  how  BIR-1  relates  to  this  machinery.  This  approach  has 
recently  been  used  very  productively  by  the  lab  of  Bob  Horvitz,  who  has  shown 
that  inhibition  of  Aurora-like  kinase  activity  leads  to  a  very  similar  defect  in 
cytokinesis  to  that  seen  following  inhibition  of  BIR-1  and,  following  this 
observation,  that  BIR-1  appears  to  be  required  for  the  localization  of  Aurora-like
kinase  to  the  cytokinesis  furrow  and  mid-body. 

Rather  than  examine  individual  candidate  genes  to  determine  whether  they  have 
a  role  in  cytokinesis,  I  wished  to  carry  out  a  genome-wide  screen  to  identify  all 
genes  that  are  required  for  cytokinesis;  to  do  this  I  have  made  use  of  RNA-mediated
interference  (RNAi).  RNAi  is  a  technique  whereby  the  activity  of  a
particular  gene  is  transiently  inhibited  following  the  introduction  of  dsRNA  of 
sequence  specific  to  the  targeted  gene.  The  specificity  and  potency  of  RNAi 
make  it  an  ideal  technique  to  investigate  gene  function  beginning  only  with 
genomic  sequence.  Ingestion  of  dsRNA-expressing  bacteria  results  in  RNAi  of 
the  targeted  gene  and  we  have  established  that  this  technique  is  at  least  as 
effective  as  the  injection  of  dsRNA  for  RNAi.  It  is  thus  possible  to  make  a  library 
of  bacteria,  each  expressing  dsRNA  corresponding  to  an  individual  gene,  to 
target  each  and  every  predicted  gene  in  the  C.  elegans  genome. 


I  have  made  such  a  library  for  all  genes  on  Chromosome  I  (~13%  of  all  predicted
genes)  and  have  screened  these  genes  for  RNAi  phenotypes.  I  have  identified 
339  genes  with  RNAi  phenotypes  of  which  221  are  embryonic  lethal  (as  would  be 
expected  for  a  gene  with  a  defect  in  cytokinesis).  This  analysis  of  chromosome  I 
is  the  first  systematic  reverse  genetic  analysis  of  a  multicellular  organism  and  has 
resulted  in  a  Nature  Article. 

To  characterise  the  defect  that  gives  rise  to  embryonic  lethality,  our  lab  is
currently  making  time-lapse  movies  of  embryos  following  RNAi  of  each  of  the
221  genes  required  for  embryonic  viability.
Thus  far  we  have  identified  several  genes  that  are  required  for  cytokinesis. 
These  include  profilin,  an  actin-binding  protein;  a  gene  encoding  a  protein  with 
high  homology  to  a  centromeric  protein  INCENP;  and  a  homologue  of  the  S. 
cerevisiae  SCD6  gene  which  may  be  involved  in  vesicle  fusion  and  trafficking,  a 
process  thought  to  be  involved  in  C.  elegans  cytokinesis. 

Future  Work 

I  am  in  the  process  of  extending  the  RNAi  analysis  of  the  C.  elegans  genome  to 
encompass  the  remaining  five  chromosomes.  I  anticipate  that  construction  of 
the  dsRNA-expressing  bacterial  library  should  be  complete  by  early  2001,  and 
that  analysis  of  the  RNAi  phenotypes  of  all  embryonic  lethal  genes  by  time-lapse 
videomicroscopy  should  be  complete  by  summer  2001.  By  the  time  this  screen  is 
completed,  I  expect  to  have  identified  30-50  genes  that  are  required  for 
cytokinesis.  This  should  have  greatly  expanded  our  knowledge  of  the  molecular 
components  of  the  cytokinesis  machinery  of  C.  elegans.  I  will  then  attempt  to 
elucidate  how  BIR-1  interacts  with  the  identified  gene  products  and  thus  to 
understand  the  involvement  of  BIR-1  in  cytokinesis. 

Achievements 


•  Cloning  of  2496  genes  to  generate  a  dsRNA-expressing  bacterial  library.

•  Screening  the  2496-gene  bacterial  library  to  identify  RNAi  phenotypes  of
the  cloned  genes.

•  Identification  of  339  genes  with  RNAi  phenotypes,  of  which  221  are
required  for  embryonic  viability.  Detailed  analysis  of  these  data,  culminating  in
publication  as  a  Nature  Article  (attached).

•  Time-lapse  videomicroscopic  analysis  of  the  RNAi  phenotypes  of  embryos
for  150  of  the  221  genes  required  for  embryonic  viability.

•  Identification  of  5  new  genes  required  for  cytokinesis  in  C.  elegans.


Systematic  functional  genomic  analysis  of  C.  elegans 
Chromosome  I  by  RNA  interference 

Andrew  G.  Fraser*1,  Ravi  S.  Kamath*1,  Peder  Zipperlen*1,  Maruxa  Martinez-Campos*,
Marc  Sohrmann†  and  Julie  Ahringer*.

*Wellcome/CRC  Institute,  University  of  Cambridge,  Tennis  Court  Road,  CB2  1QR
Cambridge,  UK.  Phone  +44-1223-334144;  Fax  +44-1223-334089.  †The  Sanger
Centre,  Wellcome  Trust  Genome  Campus,  Hinxton,  Cambridge  CB10  1SA,  UK.

1  These  authors  contributed  equally  to  this  work. 


Complete  genomic  sequence  is  known  for  two  multicellular  eukaryotes,  the 
nematode  Caenorhabditis  elegans  and  the  fruit  fly  Drosophila  melanogaster,  and 
will  be  soon  for  humans.  However,  biological  function  has  been  assigned  to 
only  a  small  proportion  of  the  predicted  genes  in  any  animal.  We  used  RNA- 
mediated  interference  (RNAi)  to  target  nearly  90%  of  predicted  genes  on  C. 
elegans  Chromosome  I  by  feeding  worms  with  bacteria  that  express  double-stranded
RNA.  We  have  assigned  function  to  13.9%  of  the  genes  analysed,
increasing  the  number  of  sequenced  genes  with  known  phenotypes  on 
Chromosome  I  from  70  to  378.  While  most  genes  with  sterile  or  embryonic 
lethal  RNAi  phenotypes  are  involved  in  basal  cell  metabolism,  many  genes 
giving  post-embryonic  phenotypes  have  conserved  but  unknown  function.  In 
addition,  conserved  genes  are  significantly  more  likely  to  have  an  RNAi 
phenotype  than  genes  with  no  conservation.  We  have  constructed  a  reusable 
library  of  bacterial  clones  that  permits  unlimited  future  RNAi  screens,  which 
should  help  develop  a  more  complete  view  of  the  relationships  between  the 
genome,  gene  function,  and  the  environment. 
The  complete  genomic  sequence  of  an  organism  is  an  invaluable  tool  in
understanding  the  molecular  mechanisms  underlying  its  development  and  function. 

The  nematode  worm  C.  elegans  is  one  of  two  multicellular  eukaryotes  for  which 
essentially  complete  genomic  sequence  is  known  f7.  36%  of  predicted  C.  elegans 
genes  have  a  significant  human  match1,3  including  many  genes  implicated  in  human
diseases3,4,  and  functional  analysis  of  the  C.  elegans  genome  has  shed  light  on  many
conserved  biological  processes  and  molecular  pathways.  A  comprehensive  functional 
analysis  of  all  genes  in  C.  elegans  would  greatly  expand  our  knowledge  of  conserved 
gene  function.  We  therefore  decided  to  investigate  systematically  loss-of-function 
phenotypes  of  predicted  genes  of  C.  elegans,  starting  with  Chromosome  I. 

RNA-mediated  interference  (RNAi)  is  a  technique  whereby  the  activity  of  a 
gene  is  transiently  inhibited  following  the  introduction  of  double-stranded  RNA 
(dsRNA)  of  sequence  specific  to  the  targeted  gene5.  The  specificity  and  potency  of 
RNAi  make  it  ideal  for  investigating  gene  function  beginning  only  with  genomic 
sequence6.  Ingestion  of  dsRNA-expressing  bacteria  results  in  RNAi  of  the  targeted 
gene7,  and  we  previously  established  that  this  technique  is  at  least  as  effective  as  the 
injection  of  dsRNA  for  RNAi8:  embryonic  lethal  phenotypes  are  detected  with 
similar  efficiency  by  feeding  and  injection,  but  feeding  detects  over  50%  more  post- 
embryonic  phenotypes  than  injection.  It  is  thus  possible  to  make  a  library  of 
dsRNA-expressing  bacteria  which  could  be  used  for  high-throughput  genome-wide 
RNAi  screens  at  very  low  cost.  It  is  important  to  note  that  since  RNAi  does  not 
efficiently  inhibit  all  genes,  an  RNAi-based  screen  will  miss  some  relevant  genes. 
Despite  this  caveat,  RNAi  is  a  useful  screening  tool  to  complement  classical  forward 
genetics. 

Analysis  of  the  function  of  genes  on  Chromosome  I  by  RNAi 

We  constructed  a  library  of  bacteria  expressing  dsRNA  corresponding  to  genes 
on  Chromosome  I.  Chromosome  I  is  the  second  smallest  chromosome,  has  few
duplicated  gene  clusters  and  has  no  striking  unusual  features1.  Each  individual
bacterial  clone  is  able  to  synthesise  dsRNA  designed  to  target  a  single  gene;  since  gene 
predictions  are  still  changing,  a  few  primer  pairs  no  longer  correspond  to  single  genes 
(see  Methods).  In  total,  the  resulting  library  contains  2445  independent  clones, 
corresponding  to  2416  predicted  genes,  a  total  of  87.3%  of  the  2769  currently 
predicted  genes  of  Chromosome  I. 

We  screened  the  library  to  identify  genes  whose  inhibition  gives  a  clearly  detectable 
phenotype  in  wild-type  worms  as  described  in  Methods.  We  were  able  to  assign  a 
phenotype  to  13.9%  of  the  analysed  genes,  raising  the  number  of  sequenced  genes  on 
chromosome  I  with  known  phenotypes  from  70  to  378  (Table  1).  Many  genes  have 
more  than  one  associated  phenotype,  reflecting  that  genes  frequently  have  multiple 
functions  in  the  organism.  Furthermore,  since  we  examined  worms  that  were  only 
exposed  to  dsRNA  as  larvae  or  adults  as  well  as  their  progeny,  we  could  assign  post- 
embryonic  phenotypes  to  genes  that  result  in  sterility  or  produce  100%  embryonic 
lethal  progeny.  A  summary  of  these  results  and  a  partial  listing  of  the  phenotypes 
obtained  are  given  in  Tables  1  and  3.  Full  results  are  in  Supplementary  Table  1  and 
are  publicly  accessible  in  WormBase  (www.wormbase.org).
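The headline percentages in this section can be checked with a few lines of arithmetic. The sketch below uses only counts quoted in the text (the 339 genes with an RNAi phenotype are reported elsewhere in this document); the variable names are illustrative, and reading the 13.9% figure as phenotypes per clone is an assumption that happens to agree with the rounding.

```python
# Counts quoted in the text; variable names are illustrative only.
clones_in_library = 2445
genes_targeted = 2416
predicted_genes_chr1 = 2769
genes_with_rnai_phenotype = 339
known_before, known_after = 70, 378

# Fraction of predicted Chromosome I genes covered by the library.
coverage_pct = 100 * genes_targeted / predicted_genes_chr1

# Assumption: the 13.9% figure is phenotypes per independent clone.
hit_rate_pct = 100 * genes_with_rnai_phenotype / clones_in_library

# Increase in sequenced genes with a known phenotype.
fold_increase = known_after / known_before

print(f"coverage: {coverage_pct:.1f}%")
print(f"hit rate: {hit_rate_pct:.1f}%")
print(f"fold increase: {fold_increase:.1f}x")
```

Note that 70 + 339 is larger than 378, so some of the genes with an RNAi phenotype presumably already had a known phenotype before the screen.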

Our  screen  was  sufficiently  effective  to  identify  90%  of  known  embryonic 
lethal  genes.  In  addition,  we  were  able  to  assign  phenotypes  to  45%  of  genes  with  a 
known  post-embryonic  phenotype  that  should  have  been  detectable  in  our  screen 
(Table  2  and  Supplementary  Table  2).  However,  we  failed  to  find  phenotypes  for  a 
number  of  previously  characterised  genes.  In  some  cases  (e.g.  fog-3),  this  was  not
due  to  an  inherent  difficulty  in  inhibiting  the  genes  using  RNAi  (since  we  obtained  the 
correct  phenotype  in  a  separate  experiment),  but  simply  because  we  overlooked  them 
in  the  screen.  However,  only  one  of  eight  genes  involved  in  neuronal  function  gave  a 
detectable  RNAi  phenotype;  this  accords  well  with  our  finding  that  neurons  appear  to 
be  more  resistant  to  RNAi  than  other  cell  types8.  Similarly,  we  did  not  detect
phenotypes  for  several  genes  involved  in  sperm  development  (fer-1,  spe-9,  and  spe-11).


The  largest  phenotypic  class,  comprising  over  60%  of  the  genes,  are  those 
whose  inhibition  by  RNAi  gives  rise  to  embryonic  lethality,  the  Emb  genes;  these 
include  a  large  number  of  components  of  the  basal  cellular  machinery.  More 
interestingly,  we  find  a  homologue  of  the  SMN  human  disease  gene9,  a  variety  of 
genes  encoding  RNA-binding  proteins  (several  such  proteins  play  a  role  in  early 
polarity;  reviewed  in  10),  a  number  of  genes  involved  in  chromosome  condensation  and 
separation,  components  of  signal  transduction  pathways  and  many  conserved  genes 
that  have  no  known  biochemical  function. 

The  largest  class  of  post-embryonic  phenotype  is  the  Uncoordinated  (Unc) 
class.  Unc  phenotypes  arise  from  defects  in  the  development  or  function  of  the 
neuromuscular  system  (reviewed  in  11).  We  find  Unc  genes  encoding  proteins
involved  in  vesicle  sorting  and  fusion  as  well  as  transcription  factors  (including  a 
homologue  of  the  zinc  finger  transcription  factor  MYT-1  which  is  only  expressed  in 
developing  neurons  in  mammals12-14)  and  components  of  the  cytoskeleton  (e.g.  a
kakapo15-18  and  a  talin19  homologue).

A  number  of  genes  showed  a  high  incidence  of  males  (Him)  phenotype.  C. 
elegans  is  usually  grown  as  a  self-fertilising  hermaphrodite  with  males  arising  at  a  low 
frequency  in  wild-type  cultures  due  to  non-disjunction  of  the  X-chromosome 
(hermaphrodites  have  two  X  chromosomes,  males  only  one).  An  increased  number  of 
males  is  indicative  of  either  the  incorrect  segregation  and  maintenance  of  chromosomes 
in  the  germ  line  (reviewed  in  20)  or  defects  in  sexual  specification.  The  Him  genes  that 
we  identified  include  kinesins,  a  katanin  homologue21,22  and  a  nuclear  hormone
receptor. 

Conservation  of  genes  with  RNAi  phenotypes  across  eukaryotes 
We  examined  the  level  of  cross-species  conservation  of  the  genes  for  which  we
detected  an  RNAi  phenotype  (Fig  1).  To  find  C.  elegans  genes  that  are  conserved  in
other  species,  we  identified  C.  elegans  genes  that  have  hits  with  BlastP23  e-values
below  1.00E-06  in  Saccharomyces  cerevisiae,  Drosophila  melanogaster  or  humans;
we  define  these  as  a  “match”.  Hits  with  BlastP  e-values  below  1.00E-10  and  in  which
the  conservation  extends  over  at  least  80%  of  the  C.  elegans  protein  length,  we 
defined  as  “homologues”;  this  category  includes  orthologues.  This  provides  a 
conservative  estimate  of  the  number  of  genes  with  regions  of  conservation  (matches) 
or  homologues,  respectively. 
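The two conservation categories can be written as a small decision rule. The following is a minimal sketch, not the authors' analysis code: the `BlastHit` record, its field names and the `classify_hit` function are hypothetical, and only the thresholds (e-value below 1.00E-06 for a match; e-value below 1.00E-10 with conservation over at least 80% of the C. elegans protein for a homologue) come from the text.

```python
from dataclasses import dataclass

MATCH_EVALUE = 1e-6            # "match" threshold from the text
HOMOLOGUE_EVALUE = 1e-10       # "homologue" threshold from the text
HOMOLOGUE_MIN_COVERAGE = 0.80  # fraction of the C. elegans protein length


@dataclass
class BlastHit:
    """Hypothetical record for the best BlastP hit of one C. elegans protein."""
    evalue: float
    aligned_length: int  # residues of the C. elegans protein covered by the hit
    query_length: int    # full length of the C. elegans protein


def classify_hit(hit: BlastHit) -> str:
    """Return the most specific category; every homologue is also a match."""
    coverage = hit.aligned_length / hit.query_length
    if hit.evalue < HOMOLOGUE_EVALUE and coverage >= HOMOLOGUE_MIN_COVERAGE:
        return "homologue"
    if hit.evalue < MATCH_EVALUE:
        return "match"
    return "none"


print(classify_hit(BlastHit(evalue=1e-20, aligned_length=450, query_length=500)))
print(classify_hit(BlastHit(evalue=1e-20, aligned_length=200, query_length=500)))
print(classify_hit(BlastHit(evalue=1e-3, aligned_length=450, query_length=500)))
```

A strongly significant hit that covers only part of the protein is counted as a match but not as a homologue, which is what makes the homologue category the more conservative of the two.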

We  found  that  genes  with  RNAi  phenotypes  were  much  more  likely  to  have  a  match 
(p<0.001)  compared  to  all  genes  (Fig  1).  Most  striking  is  the  similarity  that  we  see 
between  C.  elegans  and  Drosophila:  while  42%  of  C.  elegans  genes  have  a  match  and
19%  have  a  homologue  in  Drosophila,  we  find  that  over  72%  of  genes  with  an  RNAi
phenotype  have  a  Drosophila  match  and  43%  have  a  homologue  (Fig  1).  This 
analysis  shows  that  genes  with  a  required  function  in  C.  elegans  have  been  highly 
conserved  across  eukaryotic  evolution.  We  also  find  that  highly  conserved  genes  are 
more  likely  to  have  an  RNAi  phenotype  than  genes  that  show  no  conservation:  26% 
of  C.  elegans  genes  that  have  a  homologue  in  one  of  the  organisms  examined  give  an 
RNAi  phenotype  compared  to  only  5%  of  genes  with  no  conservation  (p<0.001). 

Physical  distribution  on  chromosome  I  of  genes  with  RNAi  phenotypes 

Genes  for  which  we  identified  an  RNAi  phenotype  are  evenly  distributed 
across  the  chromosome  with  the  exception  of  two  regions  (corresponding  to  segments 
2  and  8-9  in  Fig  2a)  for  which  there  appears  to  be  a  drop  in  number  (p<0.1).  These 
two  regions  correspond  to  the  two  regions  of  chromosome  I  that  contain  locally 
duplicated  gene  clusters1.  We  suggest  that  the  reduction  in  the  number  of  phenotypes 
observed  by  RNAi  in  these  regions  may  be  due  to  gene  duplication  and  thus 
redundancy  of  function.  It  is  worth  noting  that  some  of  the  predicted  genes  in  the
duplicated  regions  may  not  be  expressed:  while  genes  with  RNAi  phenotypes  are
equally  likely  to  have  an  EST  in  all  regions  of  the  genome  (see  below),  there  is  a 
significant  drop  (p<0.05)  in  the  proportion  of  total  genes  with  ESTs  in  the  second 
locally  duplicated  gene  cluster  region  (Fig  2b;  39%  of  genes  in  the  second  cluster  have 
an  EST  compared  with  53%  over  the  entire  chromosome).  We  suggest  that  a  portion 
of  the  predicted  genes  in  such  regions  of  duplication  may  in  fact  be  pseudogenes. 

Genes  that  give  RNAi  phenotypes  are  much  more  likely  to  have  an  EST  than 
genes  on  chromosome  I  in  general  (82%  versus  53%  respectively,  p<0.001;  Fig  2b).
The  relatively  high  percentage  of  genes  with  RNAi  phenotypes  that  have  ESTs  may 
reflect  that  these  genes  are  expressed  at  higher  levels.  It  may  also  be  that  many  genes 
that  currently  lack  ESTs  are  only  expressed  conditionally;  we  are  unlikely  to  have 
found  phenotypes  for  such  genes. 

In  C.  elegans,  there  is  evidence  of  differences  between  the  chromosome  arms 
and  the  central  regions  (the  clusters),  suggesting  that  there  might  be  differences  in  gene 
type  or  function  across  the  chromosome24.  In  general,  the  distribution  of  genes  in  any 
given  phenotypic  class  was  similar  to  that  for  all  genes  with  an  RNAi  phenotype  (e.g.
Emb  genes;  compare  Fig  2c  with  2a).  However,  genes  with  viable  post-embryonic 
phenotypes  (Pep  genes)  —  those  that  gave  a  post-embryonic  phenotype  without  any 
embryonic  or  post-embryonic  lethality,  sterility,  or  developmental  delay  —  show  a 
trend  toward  enrichment  at  the  arms  of  chromosome  I  (p<0.1).  It  has  been  suggested
that  the  chromosome  arms  may  be  more  prone  to  mutation  and  recombination  than  the 
central  core  portion24  and,  if  so,  that  novel  gene  functions  are  more  likely  to  evolve  in 
such  regions.  Our  finding  that  genes  which  uniquely  affect  post-embryonic 
development  cluster  at  the  arms  supports  this  model. 

Relationships  between  the  predicted  biochemical  function  of  a  gene  product 
and  its  RNAi  phenotype 
To  explore  the  relationship  between  the  biochemical  function  of  a  gene  product
and  its  mutant  phenotype,  we  categorised  the  sterile  (Ste),  embryonic  lethal  (Emb), 
uncoordinated  (Unc)  and  viable  post-embryonic  phenotype  (Pep)  genes  into  the 
functional  classes  shown  in  Fig  3a. 

Unsurprisingly,  genes  involved  in  basal  metabolic  processes  account  for  ~50% 
of  Ste  and  Emb  genes  (Fig.  3a);  this  confirms  that  these  basic  biochemical  processes 
are  indeed  essential  for  viability.  In  contrast,  under  20%  of  Unc  and  Pep  genes  encode 
components  of  the  basal  metabolic  machinery,  whereas  more  than  twice  as  many 
encode  proteins  with  more  specialized  functions  (Figs.  3a,  b).  There  is  thus  a  clear 
difference  between  the  types  of  gene  required  for  germline  function  or  embryonic 
viability  (which  mainly  require  basal  machinery)  and  those  involved  in  later 
developmental  processes  which  appear  to  require  proteins  either  of  more  specialized 
functions  or  of  as  yet  unknown  function  (Fig.  3b). 

A  second  clear  trend  is  that  the  number  of  genes  of  unknown  function 
increases  greatly  in  the  Unc  and  Pep  genes,  making  this  the  largest  overall  class  for 
those  phenotypes  (Fig.  3).  This  shift  underlies  the  fact  that  while  we  know  a  great 
deal  about  basic  metabolic  processes  of  eukaryotic  cells  (and  thus  can  readily  ascribe 
function  to  a  large  proportion  of  Ste  and  Emb  genes),  much  is  still  to  be  learnt  about 
the  complex  processes  and  the  genes  that  regulate  the  development  and  function  of  a 
multicellular  eukaryote.  A  significant  number  (~25%)  of  genes  of  unknown  function 
have  close  homologues  in  Drosophila  or  humans;  further  study  of  these  may  shed 
light  on  conserved  processes  specific  to  animals. 

Comparison  of  genes  essential  for  viability  of  S.  cerevisiae  and  C.  elegans 

S.  cerevisiae  was  the  first  eukaryote  to  be  completely  sequenced25  and  reverse 
genetics  has  been  used  extensively  to  investigate  S.  cerevisiae  gene  function.  In  a  set 
of  3680  genes  knocked  out  by  targeted  disruption,  890  affect  viability26;  we  compared
these  genes  to  those  that  gave  different  RNAi  phenotypes  in  C.  elegans.  Yeast  and
worm  genes  important  for  viability  have  a  similar  distribution  within  the  different 
functional  classes,  but  are  different  from  the  Unc  or  Pep  distributions  (Fig  3c;  also 
compare  to  3a  and  3b).  This  suggests  that  similar  types  of  gene  are  required  for 
viability  of  yeast  and  animal  cells.  A  striking  difference  (p<0.001)  is  that  only  ~1%  of 
the  genes  required  for  viability  in  yeast  are  transcription  factors,  whereas  for  C. 
elegans  it  is  ~4%  (a  similar  percentage  of  the  genomes  of  yeast27  and  C.  elegans 2 
encode  transcription  factors,  3.3%  and  2.5%  respectively).  This  suggests  that  a  large 
fraction  of  the  C.  elegans  transcription  factors  required  for  viability  may  be  involved 
in  specific  developmental  processes. 

An  estimate  of  the  size  of  the  functionally  non-redundant  genome 

What  do  our  data  tell  us  about  the  size  of  the  functionally  non-redundant 
genome?  We  screened  12.7%  of  the  C.  elegans  genome  and  found  that  339  genes  gave 
a  clearly  discernible  phenotype.  Taking  into  account  the  sensitivity  of  our  screen  and 
scaling  up  to  the  entire  genome,  we  estimate  that  ~5400  genes  will  be  individually 
required  for  wild-type  C.  elegans  development  under  standard  laboratory  conditions 
(~2300  genes  for  embryonic  viability  and  ~3100  post-embryonically;  see  Methods  for
calculation).  This  is  comparable  to  previous  estimates  based  on  forward  genetics28. 
We  expect  that  phenotypes  for  other  genes  will  be  identified  under  novel  conditions 
(e.g.  environmental  stress),  in  other  genetic  backgrounds,  or  using  more  refined  and
restricted  screening  conditions. 
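The scaling argument can be illustrated with a short sketch. It is not the authors' calculation (that is described in Methods and is not reproduced here); it simply divides observed counts by the fraction of the genome screened and by a detection sensitivity, using figures quoted in this document: 221 embryonic lethal genes among the 339 genes with a phenotype, 12.7% of the genome screened, and recovery of 90% of known embryonic lethal genes and 45% of known post-embryonic genes. The function name and the two-class split are assumptions, and the naive result is lower than the ~5400 estimate above, which rests on the authors' own class definitions.

```python
def scale_to_genome(observed: int, fraction_screened: float, sensitivity: float) -> float:
    """Extrapolate an observed gene count to the whole genome.

    observed          -- genes with a phenotype found in the screen
    fraction_screened -- share of the genome that was screened (0-1)
    sensitivity       -- share of true positives the screen detects (0-1)
    """
    return observed / (fraction_screened * sensitivity)


FRACTION_SCREENED = 0.127  # 12.7% of the genome, from the text

# Assumption: split the 339 genes into Emb (221) and all others (118).
emb_estimate = scale_to_genome(221, FRACTION_SCREENED, 0.90)
other_estimate = scale_to_genome(339 - 221, FRACTION_SCREENED, 0.45)

print(f"embryonic lethal: ~{emb_estimate:.0f}")
print(f"other phenotypes: ~{other_estimate:.0f}")
print(f"total:            ~{emb_estimate + other_estimate:.0f}")
```

The point of the sketch is the shape of the correction: a class that the screen detects poorly (here the post-embryonic genes) is scaled up more strongly than one it detects well.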

Discussion 

We  have  taken  a  systematic  approach  to  identify  functions  for  the  predicted 
genes  of  C.  elegans  Chromosome  I.  This  is  the  first  large-scale  reverse  genetic 
analysis  of  a  multicellular  organism  and  has  increased  by  five-fold  the  number  of 
sequenced  genes  with  known  phenotypes  on  this  chromosome. 
While  we  have  identified  RNAi  phenotypes  for  many  genes,  some  will  have
eluded  our  screen  for  one  of  at  least  two  reasons.  Firstly,  RNAi  may  have  been 
ineffective  against  the  targeted  gene.  RNAi  does  not  accurately  phenocopy  the  null 
phenotype  of  all  genes  (e.g.  genes  involved  in  neuronal  function),  and  may  result  in
either  partial  or  no  loss  of  function.  It  should  also  be  noted  that  if  multiple  genes  have 
regions  of  identical  or  near-identical  nucleotide  sequence,  RNAi  could  target  them 
simultaneously,  so  that  the  observed  phenotype  may  be  the  result  of  the  inhibition  of 
more  than  one  gene.  Secondly,  we  will  not  have  detected  either  subtle  or  conditional 
phenotypes.  However,  we  anticipate  that  future  RNAi-based  screens  using  specific 
assays  should  be  able  to  detect  phenotypes  for  many  more  genes,  thus  increasing  our 
understanding  of  C.  elegans  and  hence  of  metazoan  biology  in  general.  Since  our 
library  consists  of  bacterial  clones  that  can  be  replicated,  and  the  feeding  protocol  is 
relatively  simple  compared  with  injection,  the  library  can  be  used  repeatedly  at  low 
cost  and  high  efficiency  for  such  screens.  In  addition,  we  expect  that  a  feeding  library 
and  database  of  associated  phenotypes  will  prove  valuable  for  the  positional  cloning 
of  genes;  currently  there  are  over  300  genes  on  chromosome  I  identified  by  mutation 
but  not  yet  cloned. 

Although  the  time  needed  for  an  RNAi  screen  using  our  bacterial  library  is 
similar  to  that  for  a  classical  genetic  screen,  the  two  approaches  have  different 
advantages  and  will  yield  different  results.  Both  approaches  can  be  used  to  screen  the 
entire  genome  for  genes  involved  in  a  particular  process,  and  both  may  identify 
complete  or  partial  loss-of-function  phenotypes.  Classical  forward  genetics  generates 
stable  mutant  lines  that  can  be  maintained  indefinitely;  furthermore,  while  some  genes 
are  resistant  to  RNAi,  all  genes  are  sensitive  to  mutagens  (albeit  to  a  greater  or  lesser 
degree)  and  could  thus  be  cloned  using  a  classical  screen.  Also,  some  mutants  isolated 
by  forward  genetics  are  due  to  gain-of-function  mutations,  which  cannot  be  generated 
by  RNAi.  However,  the  positional  cloning  of  a  gene  is  often  slow  and  laborious. 
RNAi,  while  having  the  disadvantages  mentioned  above,  has  the  key  advantage  of  all
reverse  genetics:  the  sequence  of  the  gene  is  already  known,  and  thus  any  mutant
phenotype  observed  is  automatically  connected  to  a  known  sequence. 

In  the  future,  we  aim  to  extend  our  library  construction  and  functional  analysis 
to  the  entire  C.  elegans  genome  and  anticipate  that  the  possibility  of  genome-wide 
RNAi  screening,  in  conjunction  with  other  functional  genomics  approaches  such  as 
expression  analyses  using  microarrays29  and  two-hybrid  experiments30  will  accelerate 
C.  elegans  research. 
Methods


Generation  and  cloning  of  PCR  products.  PCR  products  were  synthesised 
using  BioTaq  polymerase  (Bioline)  in  a  reaction  containing  25ng  of  C.  elegans  genomic 
DNA,  20pmol  of  C.  elegans  GenePairs  primers  (Research  Genetics)  and  100µM
dNTPs:  34  cycles  of  [94°C  30s,  58°C  30s,  72°C  90s]  were  followed  by  an  extension
of  1hr  at  72°C  to  enhance  A-tailing  of  products.  Products  were  ligated  into  linearized
T-tailed  L4440  vector7  and  transformed  into  the  HT115(DE3)  bacterial  strain  (L.
Timmons  and  A.  Fire,  pers.  comm.)  using  standard  methods.  Colonies  containing 
correct  sized  insert  were  identified  by  PCR  using  vector  specific  oligos,  and  the  cloned 
inserts  confirmed  by  PCR  using  the  original  Research  Genetics  primer  pair.  Primer 
sequences  are  available  at  http://cmgm.stanford.edu/~kimlab/primers.12-22-99.html.

RNAi  screening.  RNAi  was  performed  essentially  as  described  in  Kamath  et  al.8,
where  feeding  data  on  86  of  the  2445  genes  described  here  was  previously  reported. 

In  brief,  4  wells  of  a  12-well  plate  containing  NGM  agar  +  1mM  IPTG  +  25µg/ml
carbenicillin  were  inoculated  with  bacterial  cultures  grown  8-18  hours  for  each  targeted 
gene.  10-15  L3-L4  stage  worms  were  placed  in  the  first  of  the  4  wells  for  each  gene 
and  left  for  72hrs  at  15°C.  Three  worms,  now  young  adults,  were  removed  and 
individually  placed  on  three  remaining  wells  for  each  gene  and  allowed  to  lay  embryos 
for  24hrs  at  room  temperature;  the  three  worms  were  then  removed  (t=0).  The 
phenotypes  of  adults  and  progeny  remaining  in  the  first  well  were  scored  as  well  as  of 
the  progeny  in  wells  1-3.  Our  screen  was  not  ideal  for  detection  of  phenotypes  visible 
only  in  adults  (e.g.  egg-laying  defective  and  progeny  sterile);  we  will  have  missed
some  of  these.  Phenotypic  analysis  of  lethality/sterility  was  carried  out  at  t=24hr  and 
post-embryonic  phenotypes  were  analysed  by  two  independent  observers  at  t=36hr, 
t=48hr,  t=60hr  and  t=72hr.  Phenotypic  classes  were  defined  as  follows.  Embryonic 
lethal  (Emb)  reproducibly  has  10-100%  embryonic  lethality;  sterile  (Ste)  has  a  brood 
size  of  less  than  or  equal  to  10  (wild-type  worms  in  these  conditions  typically  give 
over 50); progeny sterile (Stp) has a brood size of less than or equal to 10 in the
progeny of fed worms. Post-embryonic phenotypes require at least 10% of the
analysed worms to display a given phenotype; phenotypic classes are given in Table
1 legend. A full listing of phenotypes obtained is given in Supplementary Table 1;
genes  that  we  did  not  clone,  and  thus  did  not  analyse,  are  given  in  Supplementary 
Table  3.  Thus,  any  GenePair  absent  from  both  Supplementary  Tables  1  and  3  was 
fed  and  did  not  give  a  detectable  mutant  phenotype. 

Bioinformatic  analyses  and  categorisation  of  genes  into  functional  classes. 

Analyses  were  carried  out  on  GenePairs  predictions  rather  than  currently  predicted 
genes  since  while  gene  predictions  change,  phenotypes  will  always  match  the 
GenePair.  ~95%  of  GenePairs  genes  have  a  one-to-one  match  with  a  currently 
predicted  gene.  Current  gene  predictions  that  are  targeted  for  RNAi  by  the  primer 
pairs  were  identified  by  comparing  electronic  PCR  (ePCR)  fragments  (generated  using 
the  ePCR  program  (ftp.ncbi.nlm.nih.gov/pub/schuler/e-PCR)31  on  the  whole 
chromosome  DNA  files  from  the  WS9  release  of  ACeDB 
(ftp.sanger.ac.uk/pub/wormbase))  to  gene  predictions  in  ACeDB.  To  identify 
additional  genes  that  might  be  targeted  for  RNAi  by  a  particular  clone  we  found  those 
with  an  overlap  of  200bp  or  more  with  greater  than  80%  nucleotide  identity  with  the 
predicted  PCR  product  (asterisks  in  column  2  of  Table  3  denote  GenePairs  that  have 
such  matches);  however  it  is  not  yet  known  what  level  of  identity  is  required  for 
RNAi. 

To  find  C.  elegans  genes  with  conservation  in  other  organisms,  BlastP23  was  carried 
out  for  each  individual  C.  elegans  gene  on  Chromosome  I  against  S.  cerevisiae, 
Drosophila  melanogaster  and  human  sequences.  The  databases  used  were  as  follows: 
C. elegans (18337 entries), S. cerevisiae (6191 entries) and D. melanogaster (13743
entries)  downloaded  on  1  June  2000  from  www.ebi.ac.uk/proteome;  and  H.  sapiens 
(35723  entries,  confirmed  peptides)  downloaded  on  1  June  2000  from 

Complete genomic sequence is known for two multicellular eukaryotes, the
nematode  Caenorhabditis  elegans  and  the  fruit  fly  Drosophila  melanogaster,  and 
will  be  soon  for  humans.  However,  biological  function  has  been  assigned  to 
only  a  small  proportion  of  the  predicted  genes  in  any  animal.  We  used  RNA- 
mediated  interference  (RNAi)  to  target  nearly  90%  of  predicted  genes  on  C. 
elegans  Chromosome  I  by  feeding  worms  with  bacteria  that  express  double 
stranded  RNA.  We  have  assigned  function  to  13.9%  of  the  genes  analysed, 
increasing  the  number  of  sequenced  genes  with  known  phenotypes  on 
Chromosome  I  from  70  to  378.  While  most  genes  with  sterile  or  embryonic 
lethal  RNAi  phenotypes  are  involved  in  basal  cell  metabolism,  many  genes 
giving  post-embryonic  phenotypes  have  conserved  but  unknown  function.  In 
addition,  conserved  genes  are  significantly  more  likely  to  have  an  RNAi 
phenotype  than  genes  with  no  conservation.  We  have  constructed  a  reusable 
library  of  bacterial  clones  that  permits  unlimited  future  RNAi  screens,  which 
should  help  develop  a  more  complete  view  of  the  relationships  between  the 
genome,  gene  function,  and  the  environment. 

The complete genomic sequence of an organism is an invaluable tool in
understanding  the  molecular  mechanisms  underlying  its  development  and  function. 
The  nematode  worm  C.  elegans  is  one  of  two  multicellular  eukaryotes  for  which 
essentially  complete  genomic  sequence  is  known1-2.  36%  of  predicted  C.  elegans 
genes  have  a  significant  human  match1-3  including  many  genes  implicated  in  human 
diseases3-4,  and  functional  analysis  of  the  C.  elegans  genome  has  shed  light  on  many 
conserved  biological  processes  and  molecular  pathways.  A  comprehensive  functional 
analysis  of  all  genes  in  C.  elegans  would  greatly  expand  our  knowledge  of  conserved 
gene  function.  We  therefore  decided  to  investigate  systematically  loss-of-function 
phenotypes  of  predicted  genes  of  C.  elegans,  starting  with  Chromosome  I. 

RNA-mediated  interference  (RNAi)  is  a  technique  whereby  the  activity  of  a 
gene  is  transiently  inhibited  following  the  introduction  of  double-stranded  RNA 
(dsRNA)  of  sequence  specific  to  the  targeted  gene5.  The  specificity  and  potency  of 
RNAi  make  it  ideal  for  investigating  gene  function  beginning  only  with  genomic 
sequence6.  Ingestion  of  dsRNA-expressing  bacteria  results  in  RNAi  of  the  targeted 
gene7,  and  we  previously  established  that  this  technique  is  at  least  as  effective  as  the 
injection  of  dsRNA  for  RNAi8:  embryonic  lethal  phenotypes  are  detected  with 
similar  efficiency  by  feeding  and  injection,  but  feeding  detects  over  50%  more  post- 
embryonic  phenotypes  than  injection.  It  is  thus  possible  to  make  a  library  of 
dsRNA-expressing  bacteria  which  could  be  used  for  high-throughput  genome-wide 
RNAi  screens  at  very  low  cost.  It  is  important  to  note  that  since  RNAi  does  not 
efficiently  inhibit  all  genes,  an  RNAi-based  screen  will  miss  some  relevant  genes. 
Despite  this  caveat,  RNAi  is  a  useful  screening  tool  to  complement  classical  forward 
genetics. 

Analysis  of  the  function  of  genes  on  Chromosome  I  by  RNAi 

We  constructed  a  library  of  bacteria  expressing  dsRNA  corresponding  to  genes 
on Chromosome I. Chromosome I is the second smallest chromosome, has few
duplicated gene clusters and has no striking unusual features1. Each individual
bacterial  clone  is  able  to  synthesise  dsRNA  designed  to  target  a  single  gene;  since  gene 
predictions  are  still  changing,  a  few  primer  pairs  no  longer  correspond  to  single  genes 
(see  Methods).  In  total,  the  resulting  library  contains  2445  independent  clones, 
corresponding  to  2416  predicted  genes,  a  total  of  87.3%  of  the  2769  currently 
predicted  genes  of  Chromosome  I. 

We  screened  the  library  to  identify  genes  whose  inhibition  gives  a  clearly  detectable 
phenotype  in  wild-type  worms  as  described  in  Methods.  We  were  able  to  assign  a 
phenotype  to  13.9%  of  the  analysed  genes,  raising  the  number  of  sequenced  genes  on 
chromosome  I  with  known  phenotypes  from  70  to  378  (Table  1).  Many  genes  have 
more  than  one  associated  phenotype,  reflecting  that  genes  frequently  have  multiple 
functions  in  the  organism.  Furthermore,  since  we  examined  worms  that  were  only 
exposed  to  dsRNA  as  larvae  or  adults  as  well  as  their  progeny,  we  could  assign  post- 
embryonic  phenotypes  to  genes  that  result  in  sterility  or  produce  100%  embryonic 
lethal  progeny.  A  summary  of  these  results  and  a  partial  listing  of  the  phenotypes 
obtained  are  given  in  Tables  1  and  3.  Full  results  are  in  Supplementary  Table  1  and 
are publicly accessible in WormBase (www.wormbase.org).

Our  screen  was  sufficiently  effective  to  identify  90%  of  known  embryonic 
lethal  genes.  In  addition,  we  were  able  to  assign  phenotypes  to  45%  of  genes  with  a 
known  post-embryonic  phenotype  that  should  have  been  detectable  in  our  screen 
(Table  2  and  Supplementary  Table  2).  However,  we  failed  to  find  phenotypes  for  a 
number of previously characterised genes. In some cases (e.g. fog-3), this was not
due  to  an  inherent  difficulty  in  inhibiting  the  genes  using  RNAi  (since  we  obtained  the 
correct  phenotype  in  a  separate  experiment),  but  simply  because  we  overlooked  them 
in  the  screen.  However,  only  one  of  eight  genes  involved  in  neuronal  function  gave  a 
detectable  RNAi  phenotype;  this  accords  well  with  our  finding  that  neurons  appear  to 
be  more  resistant  to  RNAi  than  other  cell  types8.  Similarly,  we  did  not  detect 
phenotypes for several genes involved in sperm development (fer-1, spe-9, and spe-11).


The largest phenotypic class, comprising over 60% of the genes with an RNAi phenotype, consists of those
whose  inhibition  by  RNAi  gives  rise  to  embryonic  lethality,  the  Emb  genes;  these 
include  a  large  number  of  components  of  the  basal  cellular  machinery.  More 
interestingly,  we  find  a  homologue  of  the  SMN  human  disease  gene9,  a  variety  of 
genes  encoding  RNA-binding  proteins  (several  such  proteins  play  a  role  in  early 
polarity;  reviewed  in  10),  a  number  of  genes  involved  in  chromosome  condensation  and 
separation,  components  of  signal  transduction  pathways  and  many  conserved  genes 
that  have  no  known  biochemical  function. 

The  largest  class  of  post-embryonic  phenotype  is  the  Uncoordinated  (Unc) 
class.  Unc  phenotypes  arise  from  defects  in  the  development  or  function  of  the 
neuromuscular system (reviewed in 11). We find Unc genes encoding proteins
involved in vesicle sorting and fusion as well as transcription factors (including a
homologue of the zinc finger transcription factor MYT-1 which is only expressed in
developing neurons in mammals12-14) and components of the cytoskeleton (e.g. a
kakapo15-18 and a talin19 homologue).

A  number  of  genes  showed  a  high  incidence  of  males  (Him)  phenotype.  C. 
elegans  is  usually  grown  as  a  self-fertilising  hermaphrodite  with  males  arising  at  a  low 
frequency  in  wild-type  cultures  due  to  non-disjunction  of  the  X-chromosome 
(hermaphrodites  have  two  X  chromosomes,  males  only  one).  An  increased  number  of 
males  is  indicative  of  either  the  incorrect  segregation  and  maintenance  of  chromosomes 
in  the  germ  line  (reviewed  in  20)  or  defects  in  sexual  specification.  The  Him  genes  that 
we identified include kinesins, a katanin homologue21,22 and a nuclear hormone
receptor. 

Conservation  of  genes  with  RNAi  phenotypes  across  eukaryotes 

We examined the level of cross-species conservation of the genes for which we
detected  an  RNAi  phenotype  (Fig  1).  To  find  C.  elegans  genes  that  are  conserved  in 
other  species,  we  identified  C.  elegans  genes  that  have  hits  with  BlastP23  e-values 
below 1.00E-06 in Saccharomyces cerevisiae, Drosophila melanogaster or humans;
we define these as a “match”. Hits with BlastP e-values below 1.00E-10 in which
the conservation extends over at least 80% of the C. elegans protein length we
define as “homologues”; this category includes orthologues. This provides a
conservative  estimate  of  the  number  of  genes  with  regions  of  conservation  (matches) 
or  homologues,  respectively. 
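
The two thresholds above amount to a simple decision rule. The following Python sketch restates it, assuming each best BlastP hit has already been reduced to an e-value and the fraction of the C. elegans protein covered by the alignment. The names `Hit` and `classify_hit` are illustrative and are not part of the original analysis.

```python
from dataclasses import dataclass

MATCH_EVALUE = 1e-6         # "match": e-value below 1.00E-06
HOMOLOGUE_EVALUE = 1e-10    # "homologue": e-value below 1.00E-10 ...
HOMOLOGUE_COVERAGE = 0.80   # ... extending over at least 80% of the worm protein


@dataclass
class Hit:
    """Best BlastP hit of one C. elegans protein in another species (hypothetical record)."""
    evalue: float
    coverage: float  # fraction of the C. elegans protein length aligned, 0..1


def classify_hit(hit: Hit) -> str:
    """Return 'homologue', 'match' or 'none' using the thresholds in the text."""
    if hit.evalue < HOMOLOGUE_EVALUE and hit.coverage >= HOMOLOGUE_COVERAGE:
        return "homologue"
    if hit.evalue < MATCH_EVALUE:
        return "match"
    return "none"
```

Under these definitions every homologue also satisfies the match criterion, so when counting matches the "homologue" label should be treated as a subset of "match".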

We  found  that  genes  with  RNAi  phenotypes  were  much  more  likely  to  have  a  match 
(p<0.001)  compared  to  all  genes  (Fig  1).  Most  striking  is  the  similarity  that  we  see 
between C. elegans and Drosophila: while 42% of C. elegans genes have a match and
19% have a homologue in Drosophila, we find that over 72% of genes with an RNAi
phenotype  have  a  Drosophila  match  and  43%  have  a  homologue  (Fig  1).  This 
analysis  shows  that  genes  with  a  required  function  in  C.  elegans  have  been  highly 
conserved  across  eukaryotic  evolution.  We  also  find  that  highly  conserved  genes  are 
more  likely  to  have  an  RNAi  phenotype  than  genes  that  show  no  conservation:  26% 
of  C.  elegans  genes  that  have  a  homologue  in  one  of  the  organisms  examined  give  an 
RNAi  phenotype  compared  to  only  5%  of  genes  with  no  conservation  (p<0.001). 

Physical  distribution  on  chromosome  I  of  genes  with  RNAi  phenotypes 

Genes  for  which  we  identified  an  RNAi  phenotype  are  evenly  distributed 
across  the  chromosome  with  the  exception  of  two  regions  (corresponding  to  segments 
2  and  8-9  in  Fig  2a)  for  which  there  appears  to  be  a  drop  in  number  (p<0.1).  These 
two  regions  correspond  to  the  two  regions  of  chromosome  I  that  contain  locally 
duplicated  gene  clusters1.  We  suggest  that  the  reduction  in  the  number  of  phenotypes 
observed  by  RNAi  in  these  regions  may  be  due  to  gene  duplication  and  thus 
redundancy  of  function.  It  is  worth  noting  that  some  of  the  predicted  genes  in  the 
duplicated regions may not be expressed: while genes with RNAi phenotypes are
equally  likely  to  have  an  EST  in  all  regions  of  the  genome  (see  below),  there  is  a 
significant  drop  (p<0.05)  in  the  proportion  of  total  genes  with  ESTs  in  the  second 
locally  duplicated  gene  cluster  region  (Fig  2b;  39%  of  genes  in  the  second  cluster  have 
an  EST  compared  with  53%  over  the  entire  chromosome).  We  suggest  that  a  portion 
of  the  predicted  genes  in  such  regions  of  duplication  may  in  fact  be  pseudogenes. 

Genes  that  give  RNAi  phenotypes  are  much  more  likely  to  have  an  EST  than 
genes  on  chromosome  I  in  general  (82%  versus  53%  respectively,  p<0.001;  Fig  2b). 
The  relatively  high  percentage  of  genes  with  RNAi  phenotypes  that  have  ESTs  may 
reflect  that  these  genes  are  expressed  at  higher  levels.  It  may  also  be  that  many  genes 
that  currently  lack  ESTs  are  only  expressed  conditionally;  we  are  unlikely  to  have 
found  phenotypes  for  such  genes. 

In  C.  elegans,  there  is  evidence  of  differences  between  the  chromosome  arms 
and  the  central  regions  (the  clusters),  suggesting  that  there  might  be  differences  in  gene 
type  or  function  across  the  chromosome24.  In  general,  the  distribution  of  genes  in  any 
given phenotypic class was similar to that for all genes with an RNAi phenotype (e.g.
Emb  genes;  compare  Fig  2c  with  2a).  However,  genes  with  viable  post-embryonic 
phenotypes  (Pep  genes)  —  those  that  gave  a  post-embryonic  phenotype  without  any 
embryonic  or  post-embryonic  lethality,  sterility,  or  developmental  delay  —  show  a 
trend  toward  enrichment  at  the  arms  of  chromosome  I  (p<0.1).  It  has  been  suggested 
that  the  chromosome  arms  may  be  more  prone  to  mutation  and  recombination  than  the 
central  core  portion24  and,  if  so,  that  novel  gene  functions  are  more  likely  to  evolve  in 
such  regions.  Our  finding  that  genes  which  uniquely  affect  post-embryonic 
development  cluster  at  the  arms  supports  this  model. 

Relationships  between  the  predicted  biochemical  function  of  a  gene  product 
and  its  RNAi  phenotype 

To explore the relationship between the biochemical function of a gene product
and  its  mutant  phenotype,  we  categorised  the  sterile  (Ste),  embryonic  lethal  (Emb), 
uncoordinated  (Unc)  and  viable  post-embryonic  phenotype  (Pep)  genes  into  the 
functional  classes  shown  in  Fig  3a. 

Unsurprisingly, genes involved in basal metabolic processes account for ~50%
of  Ste  and  Emb  genes  (Fig.  3a);  this  confirms  that  these  basic  biochemical  processes 
are  indeed  essential  for  viability.  In  contrast,  under  20%  of  Unc  and  Pep  genes  encode 
components  of  the  basal  metabolic  machinery,  whereas  more  than  twice  as  many 
encode  proteins  with  more  specialized  functions  (Figs.  3a,  b).  There  is  thus  a  clear 
difference  between  the  types  of  gene  required  for  germline  function  or  embryonic 
viability  (which  mainly  require  basal  machinery)  and  those  involved  in  later 
developmental  processes  which  appear  to  require  proteins  either  of  more  specialized 
functions  or  of  as  yet  unknown  function  (Fig.  3b). 

A  second  clear  trend  is  that  the  number  of  genes  of  unknown  function 
increases  greatly  in  the  Unc  and  Pep  genes,  making  this  the  largest  overall  class  for 
those phenotypes (Fig. 3). This shift underlines the fact that while we know a great
deal  about  basic  metabolic  processes  of  eukaryotic  cells  (and  thus  can  readily  ascribe 
function  to  a  large  proportion  of  Ste  and  Emb  genes),  much  is  still  to  be  learnt  about 
the  complex  processes  and  the  genes  that  regulate  the  development  and  function  of  a 
multicellular eukaryote. A significant number (~25%) of genes of unknown function
have  close  homologues  in  Drosophila  or  humans;  further  study  of  these  may  shed 
light  on  conserved  processes  specific  to  animals. 

Comparison  of  genes  essential  for  viability  of  S.  cerevisiae  and  C.  elegans 

S.  cerevisiae  was  the  first  eukaryote  to  be  completely  sequenced25  and  reverse 
genetics  has  been  used  extensively  to  investigate  S.  cerevisiae  gene  function.  In  a  set 
of  3680  genes  knocked  out  by  targeted  disruption,  890  affect  viability26;  we  compared 
these genes to those that gave different RNAi phenotypes in C. elegans. Yeast and
worm  genes  important  for  viability  have  a  similar  distribution  within  the  different 
functional  classes,  but  are  different  from  the  Unc  or  Pep  distributions  (Fig  3c;  also 
compare  to  3a  and  3b).  This  suggests  that  similar  types  of  gene  are  required  for 
viability  of  yeast  and  animal  cells.  A  striking  difference  (p<0.001)  is  that  only  ~1%  of 
the  genes  required  for  viability  in  yeast  are  transcription  factors,  whereas  for  C. 
elegans it is ~4% (a similar percentage of the genomes of yeast27 and C. elegans2
encode  transcription  factors,  3.3%  and  2.5%  respectively).  This  suggests  that  a  large 
fraction  of  the  C.  elegans  transcription  factors  required  for  viability  may  be  involved 
in  specific  developmental  processes. 

An  estimate  of  the  size  of  the  functionally  non-redundant  genome 

What  do  our  data  tell  us  about  the  size  of  the  functionally  non-redundant 
genome?  We  screened  12.7%  of  the  C.  elegans  genome  and  found  that  339  genes  gave 
a  clearly  discernible  phenotype.  Taking  into  account  the  sensitivity  of  our  screen  and 
scaling up to the entire genome, we estimate that ~5400 genes will be individually
required for wild-type C. elegans development under standard laboratory conditions
(~2300 genes for embryonic viability and ~3100 post-embryonically; see Methods for
calculation).  This  is  comparable  to  previous  estimates  based  on  forward  genetics28. 
We  expect  that  phenotypes  for  other  genes  will  be  identified  under  novel  conditions 
(e.g. environmental stress), in other genetic backgrounds, or using more refined and
restricted  screening  conditions. 

Discussion 

We  have  taken  a  systematic  approach  to  identify  functions  for  the  predicted 
genes  of  C.  elegans  Chromosome  I.  This  is  the  first  large-scale  reverse  genetic 
analysis  of  a  multicellular  organism  and  has  increased  by  five-fold  the  number  of 
sequenced  genes  with  known  phenotypes  on  this  chromosome. 

While we have identified RNAi phenotypes for many genes, some will have
eluded  our  screen  for  one  of  at  least  two  reasons.  Firstly,  RNAi  may  have  been 
ineffective  against  the  targeted  gene.  RNAi  does  not  accurately  phenocopy  the  null 
phenotype of all genes (e.g. genes involved in neuronal function), and may result in
either  partial  or  no  loss  of  function.  It  should  also  be  noted  that  if  multiple  genes  have 
regions  of  identical  or  near-identical  nucleotide  sequence,  RNAi  could  target  them 
simultaneously,  so  that  the  observed  phenotype  may  be  the  result  of  the  inhibition  of 
more  than  one  gene.  Secondly,  we  will  not  have  detected  either  subtle  or  conditional 
phenotypes.  However,  we  anticipate  that  future  RNAi-based  screens  using  specific 
assays  should  be  able  to  detect  phenotypes  for  many  more  genes,  thus  increasing  our 
understanding  of  C.  elegans  and  hence  of  metazoan  biology  in  general.  Since  our 
library  consists  of  bacterial  clones  that  can  be  replicated,  and  the  feeding  protocol  is 
relatively  simple  compared  with  injection,  the  library  can  be  used  repeatedly  at  low 
cost  and  high  efficiency  for  such  screens.  In  addition,  we  expect  that  a  feeding  library 
and  database  of  associated  phenotypes  will  prove  valuable  for  the  positional  cloning 
of  genes;  currently  there  are  over  300  genes  on  chromosome  I  identified  by  mutation 
but  not  yet  cloned. 

Although  the  time  needed  for  an  RNAi  screen  using  our  bacterial  library  is 
similar  to  that  for  a  classical  genetic  screen,  the  two  approaches  have  different 
advantages  and  will  yield  different  results.  Both  approaches  can  be  used  to  screen  the 
entire  genome  for  genes  involved  in  a  particular  process,  and  both  may  identify 
complete  or  partial  loss-of-function  phenotypes.  Classical  forward  genetics  generates 
stable  mutant  lines  that  can  be  maintained  indefinitely;  furthermore,  while  some  genes 
are  resistant  to  RNAi,  all  genes  are  sensitive  to  mutagens  (albeit  to  a  greater  or  lesser 
degree)  and  could  thus  be  cloned  using  a  classical  screen.  Also,  some  mutants  isolated 
by  forward  genetics  are  due  to  gain-of-function  mutations,  which  cannot  be  generated 
by  RNAi.  However,  the  positional  cloning  of  a  gene  is  often  slow  and  laborious. 
RNAi,  while  having  the  disadvantages  mentioned  above,  has  the  key  advantage  of  all 
reverse genetics: the sequence of the gene is already known, and thus any mutant
phenotype  observed  is  automatically  connected  to  a  known  sequence. 

In  the  future,  we  aim  to  extend  our  library  construction  and  functional  analysis 
to  the  entire  C.  elegans  genome  and  anticipate  that  the  possibility  of  genome-wide 
RNAi  screening,  in  conjunction  with  other  functional  genomics  approaches  such  as 
expression analyses using microarrays29 and two-hybrid experiments30, will accelerate
C.  elegans  research. 

Methods


Generation  and  cloning  of  PCR  products.  PCR  products  were  synthesised 
using  BioTaq  polymerase  (Bioline)  in  a  reaction  containing  25ng  of  C.  elegans  genomic 
DNA, 20pmol of C. elegans GenePairs primers (Research Genetics) and 100µM
dNTPs: 34 cycles of [94°C 30s, 58°C 30s, 72°C 90s] were followed by an extension
of 1hr at 72°C to enhance A-tailing of products. Products were ligated into linearized
T-tailed L4440 vector7 and transformed into the HT115(DE3) bacterial strain (L.
Timmons and A. Fire, pers. comm.) using standard methods. Colonies containing an
insert of the correct size were identified by PCR using vector-specific oligos, and the cloned
inserts were confirmed by PCR using the original Research Genetics primer pair. Primer
sequences are available at http://cmgm.stanford.edu/~kimlab/primers.12-22-99.html.

RNAi screening. RNAi was performed essentially as described in Kamath et al.8,
where  feeding  data  on  86  of  the  2445  genes  described  here  was  previously  reported. 

In brief, 4 wells of a 12-well plate containing NGM agar + 1mM IPTG + 25µg/ml
carbenicillin  were  inoculated  with  bacterial  cultures  grown  8-18  hours  for  each  targeted 
gene.  10-15  L3-L4  stage  worms  were  placed  in  the  first  of  the  4  wells  for  each  gene 
and  left  for  72hrs  at  15°C.  Three  worms,  now  young  adults,  were  removed  and 
individually  placed  on  three  remaining  wells  for  each  gene  and  allowed  to  lay  embryos 
for  24hrs  at  room  temperature;  the  three  worms  were  then  removed  (t=0).  The 
phenotypes of adults and progeny remaining in the first well were scored, as were those of
the progeny in wells 1-3. Our screen was not ideal for detection of phenotypes visible
only in adults (e.g. egg-laying defective and progeny sterile); we will have missed
some  of  these.  Phenotypic  analysis  of  lethality/sterility  was  carried  out  at  t=24hr  and 
post-embryonic  phenotypes  were  analysed  by  two  independent  observers  at  t=36hr, 
t=48hr,  t=60hr  and  t=72hr.  Phenotypic  classes  were  defined  as  follows.  Embryonic 
lethal  (Emb)  reproducibly  has  10-100%  embryonic  lethality;  sterile  (Ste)  has  a  brood 
size  of  less  than  or  equal  to  10  (wild-type  worms  in  these  conditions  typically  give 
over 50); progeny sterile (Stp) has a brood size of less than or equal to 10 in the
progeny  of  fed  worms.  Post-embryonic  phenotypes  require  at  least  10%  of  the 
analysed  worms  to  display  a  given  phenotype;  phenotypic  classes  are  given  in  Table 
1  legend.  A  full  listing  of  phenotypes  obtained  is  given  in  Supplementary  Table  1; 
genes  that  we  did  not  clone,  and  thus  did  not  analyse,  are  given  in  Supplementary 
Table  3.  Thus,  any  GenePair  absent  from  both  Supplementary  Tables  1  and  3  was 
fed  and  did  not  give  a  detectable  mutant  phenotype. 
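
The scoring thresholds above can be written as explicit predicates. This is a minimal Python sketch that assumes the observations for one clone have been reduced to an embryonic lethality fraction, a brood size, and counts of affected and analysed worms. The function names are illustrative, and the requirement that Emb be scored "reproducibly" across wells is not modelled.

```python
def is_emb(embryonic_lethality: float) -> bool:
    """Embryonic lethal (Emb): 10-100% embryonic lethality."""
    return 0.10 <= embryonic_lethality <= 1.0


def is_sterile(brood_size: int) -> bool:
    """Ste (fed worms) or Stp (their progeny): brood size of 10 or fewer."""
    return brood_size <= 10


def has_post_embryonic_phenotype(affected: int, analysed: int) -> bool:
    """A post-embryonic class is scored when at least 10% of analysed worms show it."""
    if analysed <= 0:
        return False
    # Integer comparison avoids floating-point rounding at exactly 10%.
    return affected * 10 >= analysed
```

For example, 3 Unc animals among 30 analysed worms sit exactly on the 10% boundary and would be scored, whereas 2 among 30 would not.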

Bioinformatic  analyses  and  categorisation  of  genes  into  functional  classes. 

Analyses  were  carried  out  on  GenePairs  predictions  rather  than  currently  predicted 
genes  since  while  gene  predictions  change,  phenotypes  will  always  match  the 
GenePair. ~95% of GenePairs genes have a one-to-one match with a currently
predicted  gene.  Current  gene  predictions  that  are  targeted  for  RNAi  by  the  primer 
pairs  were  identified  by  comparing  electronic  PCR  (ePCR)  fragments  (generated  using 
the  ePCR  program  (ftp.ncbi.nlm.nih.gov/pub/schuler/e-PCR)31  on  the  whole 
chromosome  DNA  files  from  the  WS9  release  of  ACeDB 
(ftp.sanger.ac.uk/pub/wormbase))  to  gene  predictions  in  ACeDB.  To  identify 
additional  genes  that  might  be  targeted  for  RNAi  by  a  particular  clone  we  found  those 
with  an  overlap  of  200bp  or  more  with  greater  than  80%  nucleotide  identity  with  the 
predicted  PCR  product  (asterisks  in  column  2  of  Table  3  denote  GenePairs  that  have 
such  matches);  however  it  is  not  yet  known  what  level  of  identity  is  required  for 
RNAi. 
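
The criterion for flagging additional possible targets can be expressed as a filter over alignments between the predicted PCR product and other predicted genes. The Python sketch below assumes each alignment has been summarised as an overlap length and a nucleotide identity; the `Alignment` record and the gene names in the usage example are hypothetical.

```python
from typing import List, NamedTuple

MIN_OVERLAP_BP = 200   # overlap of 200bp or more
MIN_IDENTITY = 0.80    # greater than 80% nucleotide identity


class Alignment(NamedTuple):
    """Alignment of a predicted PCR product to another predicted gene (hypothetical record)."""
    gene: str
    overlap_bp: int
    identity: float  # fraction of identical nucleotides, 0..1


def possible_secondary_targets(alignments: List[Alignment]) -> List[str]:
    """Return the genes that meet both thresholds given in the text."""
    return [
        a.gene
        for a in alignments
        if a.overlap_bp >= MIN_OVERLAP_BP and a.identity > MIN_IDENTITY
    ]
```

Given alignments to three hypothetical genes, `[Alignment("geneA", 250, 0.92), Alignment("geneB", 150, 0.99), Alignment("geneC", 400, 0.80)]`, only `geneA` is flagged: `geneB` overlaps by too few bases and `geneC` does not exceed 80% identity.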

To  find  C.  elegans  genes  with  conservation  in  other  organisms,  BlastP23  was  carried 
out  for  each  individual  C.  elegans  gene  on  Chromosome  I  against  S.  cerevisiae, 
Drosophila  melanogaster  and  human  sequences.  The  databases  used  were  as  follows: 
C. elegans (18337 entries), S. cerevisiae (6191 entries) and D. melanogaster (13743
entries)  downloaded  on  1  June  2000  from  www.ebi.ac.uk/proteome;  and  H.  sapiens 
(35723  entries,  confirmed  peptides)  downloaded  on  1  June  2000  from 
www.ensembl.org. NCBI-Blast2 was used (BLASTP 2.0.6) with the SEG filter, and
the  search  space  was  set  to  7947758. 

We  defined  “sequenced  genes  with  a  known  phenotype”  as  being  those  with  a  named 
entry  in  ACeDB  that  also  have  a  known  phenotype  entered  in  ACeDB,  WormBase 
(www.wormbase.org)  or  the  Proteome  database  (www.proteome.com).  EST  data  was 
supplied  by  the  Sanger  Centre  on  21  June  2000. 

Predicted  gene  products  were  placed  into  functional  classes  by  manual  inspection, 
primarily using data from Proteome, InterPro and Blast analysis23,32. The functional
classes  are:  (1)  DNA  synthesis;  (2)  RNA  synthesis  and  processing  including  general 
transcription  machinery,  splicing/processing,  RNA  binding  and  regulation  of 
chromatin;  (3)  Protein  synthesis  and  proteolysis  including  translation,  degradation  and 
folding;  (4)  Metabolism  including  energy  production  and  intermediary  metabolism;  (5) 
Cell  cycle  and  chromosome  dynamics;  (6)  Cell  biology  and  cellular  structure  including 
cell  junction/adhesion,  cytoskeleton,  ion  channels,  protein  trafficking  and  vesicle 
regulation  and  cell  polarity;  (7)  Gene  specific  transcription;  and  (8)  Signal  transduction 
including kinases, phosphatases and components of signal transduction pathways. The
Unknown  functional  class  contains  genes  which  either  have  motifs  about  which  there 
is  insufficient  information  to  assign  a  function,  or  genes  with  no  significant  matches  in 
any  organism. 

Estimates  of  non-redundant  genome  size  were  done  as  follows.  We  detected  90.5%  of 
genes  known  to  give  an  embryonic  lethal  phenotype  and  32.6%  of  genes  known  to 
give  a  post-embryonic  phenotype.  After  screening  87.3%  of  the  genes  on 
chromosome  I,  we  identified  226  Emb  genes  and  113  genes  that  only  gave  a  post- 
embryonic  RNAi  phenotype  (including  steriles);  adjusting  for  our  efficiencies  of 
detection,  we  estimate  that  on  chromosome  I,  286  genes  should  be  required  for 
viability and ~397 for post-embryonic processes. We screened 12.7% of the genome,
and thus for the entire genome we expect 2250 Emb genes and 3130 genes to have a
post-embryonic  phenotype. 
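
The arithmetic behind these estimates can be retraced step by step. The Python sketch below is one reading of the calculation: it corrects the observed counts for detection efficiency and for the fraction of chromosome I screened, then divides the chromosome I estimate by 12.7%, which is the step that reproduces the reported genome-wide figures. The function name `scale_up` is illustrative.

```python
def scale_up(detected: int, detection_rate: float,
             chromosome_coverage: float, genome_fraction: float):
    """Return (estimate for chromosome I, estimate for the whole genome)."""
    # Correct for genes missed by RNAi, then for genes not in the library.
    on_chromosome = detected / detection_rate / chromosome_coverage
    # Scale the chromosome I estimate up to the genome.
    genome_wide = on_chromosome / genome_fraction
    return on_chromosome, genome_wide


# Values taken from the text: 226 Emb genes detected at 90.5% efficiency,
# 113 post-embryonic-only genes at 32.6%, 87.3% of chromosome I screened,
# and 12.7% of the genome screened.
emb_chr, emb_genome = scale_up(226, 0.905, 0.873, 0.127)
post_chr, post_genome = scale_up(113, 0.326, 0.873, 0.127)

print(round(emb_chr), round(post_chr))                # chromosome I estimates
print(round(emb_genome, -1), round(post_genome, -1))  # genome-wide, to the nearest 10
```

Rounded, these give 286 and 397 genes on chromosome I and about 2250 and 3130 genes genome-wide, which together make up the figure of roughly 5400 genes quoted in the main text.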

Figure Legends


Table  1  Summary  of  phenotypes  arising  from  RNAi  of  genes  on 
Chromosome  I.  The  number  of  predicted  genes  whose  targeting  via  RNAi 
gave  rise  to  each  phenotype  is  shown.  Percentages  are  given  as 
percentage  of  total  number  of  clones  screened  (2445).  Phenotypic  classes 
were  defined  as  described  in  Methods.  The  phenotypes  are  Emb 
(embryonic  lethal),  Ste  (sterile),  Stp  (sterile  progeny),  Gro  (slow  post- 
embryonic  growth),  Lva  (larval  arrest),  Lvl  (larval  lethality),  Unc 
(uncoordinated),  Pvl  (protruding  vulva),  Bmd  (body  morphological  defects), 
Dpy  (dumpy),  Clr  (clear),  Him  (high  incidence  of  males),  Rup  (ruptured),  Mlt 
(molt  defects),  Prz  (paralyzed),  Sma  (small),  Egl  (egg-laying  defective),  Sck 
(sick),  Bli  (blistering  of  cuticle),  Muv  (multivulva),  Rol  (roller),  Adi  (adult 
lethal),  Lon  (long),  and  Hya  (hyperactive). 

Table  2  Detection  of  forward  genetic  loci  on  Chromosome  I  by  RNAi. 

RNAi  phenotypes  were  compared  to  those  of  genes  that  have  known  loss- 
of-function  phenotypes.  “Genetic  loci  fed”  denotes  the  number  of  genes  in 
each  category  that  were  analysed  by  RNAi.  “Possible  to  detect”  denotes  the 
number  of  genes  that  have  a  loss-of-function  phenotype  that  would  have 
been detectable in our screen. “RNAi phenotype detected” gives the
number of genes for which a phenotype was identified. “Published
phenotype detected” gives the number of genes for which the RNAi
phenotype matched a published phenotype. Supplementary Table 2 gives
full data. RNAi could reduce both maternal gene activity in the P0 and
zygotic gene activity in the F1; this could explain some of the differences
between  RNAi  phenotypes  and  published  phenotypes. 

Table 3 Partial list of RNAi phenotypes of genes on Chromosome I. RNAi
phenotypes  are  shown  for  genes  in  the  following  functional  classes: 
Chromosome  dynamics  and  cell  cycle;  Cell  structure;  Specific  transcription; 
and  Signal  transduction.  For  each  gene,  the  following  data  are  shown:  the 
Research Genetics GenePairs name; whether a number of paralogues
might be targeted (asterisk in column 2; Methods gives the criterion); the
corresponding genetic locus name if it exists; a short description of gene
function; the RNAi phenotype in which embryonic lethality (Emb), fecundity
(Ste), post-embryonic phenotypes (P1-3) and developmental delay (Dev) are
shown  separately.  Emb  and  Ste  are  classified  into  weak  (white  box,  black 
“+”)  or  strong  (black  box,  white  “+”)  phenotypes.  For  Emb,  weak  is  10-80% 
embryonic  lethality,  strong  is  90%  embryonic  lethal  or  more;  weak  Ste 
denotes  a  brood  size  of  1  to  10,  whereas  strong  Ste  is  totally  sterile. 

Column  H  shows  whether  there  is  a  match  (white  box,  black  “+”)  or  a 
homologue  (black  box,  white  “+”)  in  Drosophila  melanogaster, 
Saccharomyces  cerevisiae  or  humans.  Phenotypic  abbreviations  are  given 
in  legend  to  Table  1.  The  GenePairs  name  does  not  always  correspond  with 
the  current  predicted  gene  name  since  gene  predictions  change. 
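
The weak/strong scoring described in this legend can be written as a small classifier. This is an illustrative sketch only; the function names are hypothetical, and values that fall outside the ranges stated in the legend (for example embryonic lethality between 80% and 90%) are returned as `None` rather than guessed.

```python
def classify_emb(percent_lethal):
    """Classify embryonic lethality (in percent) using the ranges in the legend."""
    if percent_lethal >= 90:
        return "strong"
    if 10 <= percent_lethal <= 80:
        return "weak"
    # Below 10%, or between 80% and 90%, the legend gives no class.
    return None


def classify_ste(brood_size):
    """Classify sterility from the brood size of the fed worm."""
    if brood_size == 0:
        return "strong"  # totally sterile
    if 1 <= brood_size <= 10:
        return "weak"
    return None  # larger broods are not scored as Ste


print(classify_emb(95), classify_emb(50), classify_emb(85))
print(classify_ste(0), classify_ste(7), classify_ste(200))
```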

Figure  1  Conservation  of  genes  with  an  RNAi  phenotype.  Matches  or 
homologues  of  C.  elegans  genes  were  identified  as  described  in  the  text. 
Percentages  of  all  genes  (blue  bars)  or  genes  with  RNAi  phenotypes  (red 
bars)  with  matches  or  homologues  in  S.  cerevisiae  (SC),  D.  melanogaster 
(DM),  humans  (HS),  all  three  combined  (ALL),  or  with  no  matches  in  any 
organism  (NO  M)  are  shown.  The  significance  of  the  differences  between 
the  percentages  of  genes  and  the  percentages  of  genes  with  RNAi 
phenotypes  that  have  homologues  is  p<0.001  for  all  cases  except  for 
comparison  to  human  homologues  for  which  it  is  p<0.1. 

Figure 2 Distribution on Chromosome I of genes with RNAi phenotypes
and genes with ESTs. In each panel, chromosome I was analysed in 10
consecutive portions, each containing 10% of predicted genes. a) The
percentage of all genes with an RNAi phenotype that are in each portion. b)
The percentage of all predicted genes that have an EST (blue bars) or of
genes that gave an RNAi phenotype that have an EST (red bars) in each
portion. c) The percentage of Emb genes (black bars) or genes with viable
post-embryonic phenotypes (pink bars) in each chromosomal portion. The
boxes  labelled  “dup  region”  show  the  approximate  location  of  regions 
containing  local  duplications. 
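
The partitioning used for panel a can be sketched as follows. This is a hedged illustration with made-up data, not the analysis code used for the figure: `portion_shares` and `toy_flags` are hypothetical names, and the input is assumed to be one boolean per predicted gene, ordered by chromosomal position, that is true when the gene gave an RNAi phenotype.

```python
def portion_shares(flags, n_portions=10):
    """Percentage of all phenotype genes that fall in each consecutive portion.

    flags: booleans ordered by chromosomal position (True = RNAi phenotype).
    Each portion holds an equal share of the predicted genes.
    """
    n = len(flags)
    total = sum(flags)
    shares = []
    for i in range(n_portions):
        start = i * n // n_portions
        end = (i + 1) * n // n_portions
        hits = sum(flags[start:end])
        shares.append(100.0 * hits / total if total else 0.0)
    return shares


# Hypothetical toy chromosome of 20 genes, so each portion contains 2 genes.
toy_flags = [True, True, True, False] + [False] * 12 + [False, True, False, False]
shares = portion_shares(toy_flags)
print(shares)
```

In the toy data, three of the four phenotype genes sit in the first two portions, which is the kind of uneven distribution the panel is designed to reveal.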

Figure 3 Functional classes of Emb, Ste, Unc and Pep genes. a) Predicted
products  of  genes  that  gave  Ste,  Emb,  Unc  or  viable  post-embryonic  (Pep) 
RNAi  phenotypes  were  placed  into  functional  classes  as  described  in 
Methods.  Genes  whose  products  could  not  be  accurately  classified  into  any 
of  the  8  functional  classes  were  placed  into  the  unknown  category  (white). 
Numbers  denote  the  percentage  of  genes  in  each  functional  class;  pie 
charts illustrate these numbers graphically. b) Pie charts show
distributions  of  predicted  gene  products  grouped  as  follows:  basal 
metabolic  category  (red)  comprises  the  classes  of  DNA,  RNA,  protein  and 
intermediate  metabolism;  specialized  functions  (blue)  comprises  cell  cycle 
and  chromosome  dynamics,  cell  biology  and  cellular  structure,  gene 
specific  transcription  factors  and  signal  transduction.  Worms  show  the 
tissue affected in each phenotypic class shaded in grey. c) Distribution of
genes  giving  rise  to  non-viable  RNAi  phenotypes  in  C.  elegans  (worm)  or  to 
non-viable  phenotypes  following  disruption  in  S.  cerevisiae  (yeast). 

Supplementary  Table  1  Phenotypes  arising  from  RNAi  of  genes  on 
Chromosome  I.  Genes  that  have  a  detectable  RNAi  phenotype  are  grouped 
by the functional classes shown in Fig 3. For each gene, the following data
are shown: the Research Genetics GenePairs name; whether the
sequence might target a number of paralogues (asterisk in column 2;
Methods gives the criterion for this); the corresponding genetic locus name if it
exists; a short description of gene function; the RNAi phenotype in which
embryonic lethality (Emb), fecundity (Ste), post-embryonic phenotypes
(P1-3) and developmental delay (Dev) are shown separately; existence of
matches  (lower  case)  or  homologues  (filled  box,  white  upper-case  text)  in  C. 
elegans  (CE),  Drosophila  melanogaster  (DM),  Saccharomyces  cerevisiae 
(SC)  or  humans  (HS);  and  whether  or  not  the  gene  has  an  EST  (E). 
Abbreviations  used  are  Ste  (sterile),  1-5  (fed  worm  had  1-5  progeny),  6-10 
(fed  worm  had  6-10  progeny),  Stp  (progeny  sterile),  Lvl  (larval  lethality),  Unc 
(uncoordinated),  Pvl  (protruding  vulva),  Bmd  (body  morphological  defects), 
Dpy  (dumpy),  Clr  (clear),  Him  (high  incidence  of  males),  Rup  (ruptured),  Mlt 
(molt  defects),  Prz  (paralyzed),  Sma  (small),  Egl  (egg-laying  defective),  Sck 
(sick),  Bli  (blistering  of  cuticle),  Muv  (multivulva),  Rol  (roller),  Adi  (adult 
lethal),  Lon  (long),  Hya  (hyperactive),  Gro  (slow  post-embryonic  growth)  and 
Lva  (larval  arrest).  “Mult”  indicates  that  the  gene  has  multiple  equal- 
penetrance  post-embryonic  phenotypes.  If  the  dsRNA  overlaps  multiple 
adjacent genes of different function, these appear in the “Multiple genes”
category.  The  GenePairs  name  does  not  always  correspond  with  the 
current  predicted  gene  name  since  gene  predictions  change. 

Supplementary  Table  2  RNAi  phenotypes  for  previously  identified  loci  on 
Chromosome  I.  Columns  1  and  2  give  genes  on  chromosome  I  with 
previously  identified  embryonic  lethal  or  post-embryonic  phenotypes  and  the 
GenePairs  primer  pair  that  amplifies  a  fragment  overlapping  that  gene, 
respectively. “Mutant Phenotype” gives the published phenotype. RNAi
phenotype  headings:  “Emb”  (percentage  embryonic  lethality);  “Ste”  (sterility); 
“P1”, “P2” and “P3” (post-embryonic phenotypes); and “Dev” (slow or
arrested  growth).  “Hit”  indicates  whether  an  RNAi  phenotype  was  obtained 
in  the  initial  screen  (tick),  whether  no  mutant  phenotype  was  obtained  (“o”) 
or whether a mutant phenotype was obtained in a separate feeding
experiment (“*”). Phenotype abbreviations are given in the Supplementary
Table 1 legend with the following additions: Slu (sluggish), Vul (vulvaless),
Mec  (mechanosensory  abnormality),  Daf  (dauer  larva  formation  abnormal), 
Ttx  (thermotaxis  abnormal),  Che  (chemotaxis  defective).  Also,  the  following 
abbreviations are used: phen (phenotype), migr (migration), red (reduced),
wk  (weak),  abnl  (abnormal),  and  dk  (dark).  Genes  with  null  phenotypes  that 
we  would  have  failed  to  detect  in  our  screen  are  shaded  in  light  grey;  genes 
that  we  failed  to  clone,  and  therefore  failed  to  analyse,  are  shaded  in  dark 
grey. 

Supplementary  Table  3  GenePairs  on  Chromosome  I  for  which  no  clone 
was  obtained. 

Table 1 Summary of phenotypes arising from RNAi of genes on Chromosome I

PHENOTYPE                         NUMBER   PERCENT

All phenotypes        TOTAL          339      13.9
Embryonic lethal      Emb            226       9.2
Sterile               Ste             82       3.4
                      Stp             14       0.6
Developmental delay   Gro/Lva        145       5.9
Larval lethal         Lvl             38       1.6
Post-embryonic        Unc             70       2.9
                      Pvl             29       1.2
                      Bmd             27       1.1
                      Dpy             19       0.8
                      Clr             14       0.6
                      Him             13       0.5
                      Rup              9       0.4
                      Mlt              8       0.3
                      Prz              8       0.3
                      Sma              6       0.2
                      Egl              5       0.2
                      Sck              5       0.2
                      Bli              4       0.2
                      Muv              2       0.1
                      Rol              2       0.1
                      Adi              1      <0.1
                      Lon              1      <0.1
                      Hya              1      <0.1

Table 2 Detection of forward genetic loci on Chromosome I by RNAi.

Phenotype             Genetic    Possible    RNAi phenotype   Published phenotype
                      loci fed   to detect   detected         detected

All phenotypes           62         50            31                 25
Embryonic lethal         21         21            19                 16
Sterile                   3          3             2                  2
Sterile progeny           4          4             1                  1
Developmental delay       0          0             -                  -
Larval lethal             4          4             1                  1
Post-embryonic           43         31            14                  9
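
The counts in Table 2 can be turned into detection rates with a short calculation. The sketch below assumes the rate is the number of loci with an RNAi phenotype detected divided by the number of genetic loci fed; that the detection efficiencies quoted in the genome size estimate were computed this way is an inference from the matching values, not something the text states. The names `table2_counts` and `detection_rate` are hypothetical.

```python
# (genetic loci fed, RNAi phenotype detected), taken from Table 2.
table2_counts = {
    "Embryonic lethal": (21, 19),
    "Post-embryonic": (43, 14),
}


def detection_rate(fed, detected):
    """Percentage of fed loci for which an RNAi phenotype was detected."""
    return 100.0 * detected / fed


rates = {name: round(detection_rate(fed, detected), 1)
         for name, (fed, detected) in table2_counts.items()}
print(rates)
```

Rounded to one decimal place, these ratios agree with the 90.5% and 32.6% detection efficiencies used in the genome size estimate.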

Table  3  Partial  list  of  RNAi  phenotypes  of  genes  on  Chromosome  I 


[Figure image not reproduced; y-axis label: Percentage]

Fraser et al.

Fig 2

[Figure image not reproduced; panels a and b, with “dup region 1” and “dup region 2” marked along the chromosome]


Fig  3 


a

Functional class                  Ste    Emb    Unc    Pep

DNA synthesis                     1.2    1.3    0.0    2.2
RNA metabolism                    4.8   11.8    4.7    8.9
Protein metabolism               44.6   22.3    7.8    4.4
Energy/metabolism                10.8   10.9    6.3    2.2
Chromosome dynamics/cell cycle    0.0    6.1    0.0    2.2
Cell structure/organisation      18.1   15.7   26.6   15.6

Specific  transcription 

■ 

7.2 